13 research outputs found

    Near-field Perception for Low-Speed Vehicle Automation using Surround-view Fisheye Cameras

    Full text link
    Cameras are the primary sensor in automated driving systems. They provide high information density and are optimal for detecting road infrastructure cues laid out for human vision. Surround-view camera systems typically comprise four fisheye cameras with a 190°+ field of view, covering the entire 360° around the vehicle and focused on near-field sensing. They are the principal sensors for low-speed, high-accuracy, close-range sensing applications such as automated parking, traffic jam assistance, and low-speed emergency braking. In this work, we provide a detailed survey of such vision systems, set in the context of an architecture that can be decomposed into four modular components, namely Recognition, Reconstruction, Relocalization, and Reorganization, which we jointly call the 4R Architecture. We discuss how each component accomplishes a specific aspect and provide a positional argument that they can be synergized to form a complete perception system for low-speed automation. We support this argument by presenting results from previous works and by presenting architecture proposals for such a system. Qualitative results are presented in the video at https://youtu.be/ae8bCOF77uY.
    Comment: Accepted for publication at IEEE Transactions on Intelligent Transportation Systems
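
    A minimal Python sketch of the 4R decomposition described above, under the assumption of hypothetical interfaces (the survey does not prescribe concrete APIs); each stage is a stub standing in for a full model such as a detector, a depth network, a SLAM module, or a fusion layer.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical frame type; the survey does not specify concrete data structures.
@dataclass
class SurroundViewFrame:
    """One synchronized capture from the four fisheye cameras."""
    images: Dict[str, object] = field(default_factory=dict)  # keys: front/rear/left/right

class FourRPipeline:
    """Sketch of the 4R architecture: each method is a placeholder for a full model."""

    def recognize(self, frame: SurroundViewFrame) -> List[str]:
        # Recognition: semantic understanding, e.g. object detection.
        return ["parking_slot", "pedestrian"]          # placeholder output

    def reconstruct(self, frame: SurroundViewFrame) -> Dict[str, float]:
        # Reconstruction: scene geometry, e.g. per-camera depth estimation.
        return {cam: 0.0 for cam in frame.images}      # placeholder depths

    def relocalize(self, frame: SurroundViewFrame) -> tuple:
        # Relocalization: vehicle pose against a previously built map.
        return (0.0, 0.0, 0.0)                         # placeholder (x, y, yaw)

    def reorganize(self, semantics, geometry, pose) -> dict:
        # Reorganization: fuse the three streams into one scene model.
        return {"semantics": semantics, "geometry": geometry, "pose": pose}

    def process(self, frame: SurroundViewFrame) -> dict:
        return self.reorganize(
            self.recognize(frame), self.reconstruct(frame), self.relocalize(frame)
        )

pipeline = FourRPipeline()
print(pipeline.process(SurroundViewFrame(images={"front": None, "rear": None})))
```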

    An Online Learning System for Wireless Charging Alignment using Surround-view Fisheye Cameras

    Full text link
    Electric vehicles are increasingly common, with inductive chargepads considered a convenient and efficient means of charging them. However, drivers are typically poor at aligning the vehicle to the accuracy necessary for efficient inductive charging, making automated alignment of the two charging plates desirable. In parallel to the electrification of the vehicle fleet, automated parking systems that make use of surround-view camera systems are becoming increasingly popular. In this work, we propose a system based on the surround-view camera architecture to detect, localize, and automatically align the vehicle with the inductive chargepad. The visual design of chargepads is not standardized and not necessarily known beforehand, so a system that relies on offline training will fail in some situations. We therefore propose a self-supervised online learning method that leverages the driver's actions when manually aligning the vehicle with the chargepad and combines them with weak supervision from semantic segmentation and depth to learn a classifier that auto-annotates the chargepad in the video for further training. In this way, when faced with a previously unseen chargepad, the driver needs to align the vehicle manually only once. As the chargepad lies flat on the ground, it is not easy to detect from a distance. We therefore propose using a Visual SLAM pipeline to learn landmarks relative to the chargepad, enabling alignment from a greater range. We demonstrate the working system on an automated vehicle, as illustrated in the video at https://youtu.be/_cLCmkW4UYo. To encourage further research, we will share a chargepad dataset used in this work.
    Comment: Accepted for publication at IEEE Transactions on Intelligent Transportation Systems
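
    A minimal sketch of the online-learning idea, assuming chargepad patches have already been mined from the driver's one manual alignment and weakly labelled by segmentation/depth cues (both steps are simplified away here); scikit-learn's SGDClassifier stands in for the paper's classifier, and the feature vectors are synthetic.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)

# Hypothetical 64-d appearance features: label 1 = patch over the chargepad.
positive_patches = rng.normal(1.0, 0.3, size=(20, 64))
negative_patches = rng.normal(0.0, 0.3, size=(20, 64))
X = np.vstack([positive_patches, negative_patches])
y = np.array([1] * 20 + [0] * 20)

# First manual alignment seeds the classifier.
clf = SGDClassifier(loss="log_loss", random_state=0)
clf.partial_fit(X, y, classes=[0, 1])

# Later drives: auto-annotated patches refine the classifier online.
new_X = rng.normal(1.0, 0.3, size=(5, 64))
clf.partial_fit(new_X, np.ones(5, dtype=int))
print(clf.predict(rng.normal(1.0, 0.3, size=(1, 64))))  # expect [1]
```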

    Classification of electromagnetic interference induced image noise in an analog video link

    No full text
    With the ever-increasing electrification of the vehicle showing no sign of abating, electronic systems deployed in automotive applications are subject to more stringent Electromagnetic Immunity (EMI) compliance constraints than ever before, to ensure that nearby electronic systems will not affect their operation. EMI compliance testing of an analog camera link requires video quality to be monitored and assessed to validate such compliance, which, up to now, has been a manual task. Due to the nature of human interpretation, this is open to inconsistency. Here, we propose a solution using deep learning models that analyse and grade video content derived from an EMI compliance test. These models are trained using a dataset built entirely from real test image data to ensure the accuracy of the resultant models is maximised. Starting with the standard AlexNet, we propose four models to classify the EMI noise level.
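
    A minimal sketch of adapting the standard AlexNet to classify EMI-induced noise levels, as the abstract describes; the number of classes (four here) and the training details are assumptions for illustration, and the batch is dummy data.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_NOISE_LEVELS = 4  # assumed, e.g. none / low / medium / severe

# Replace AlexNet's final classifier layer with a noise-level head.
model = models.alexnet(weights=None)  # or load pretrained weights
model.classifier[6] = nn.Linear(4096, NUM_NOISE_LEVELS)

# One training step on a dummy batch of 224x224 RGB frames.
frames = torch.randn(8, 3, 224, 224)
labels = torch.randint(0, NUM_NOISE_LEVELS, (8,))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

optimizer.zero_grad()
loss = criterion(model(frames), labels)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.3f}")
```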

    Machine Learning for Healthcare-IoT Security: A Review and Risk Mitigation

    No full text
    The Healthcare Internet-of-Things (H-IoT), commonly known as Digital Healthcare, is a data-driven infrastructure that relies heavily on smart sensing devices (e.g., blood pressure monitors and temperature sensors) for faster response times, treatment, and diagnosis. However, with the evolving cyber threat landscape, IoT devices have become more vulnerable to a broader risk surface (e.g., risks associated with generative AI and 5G-IoT), which, if exploited, may lead to data breaches, unauthorized access, loss of command and control, and potential harm. This paper reviews the fundamentals of healthcare IoT and the privacy and data security challenges associated with machine learning and H-IoT devices. The paper further emphasizes the importance of monitoring the healthcare IoT layers, such as perception, network, cloud, and application. Detecting and responding to anomalies involves accounting for various cyber-attacks and protocols such as Wi-Fi 6, Narrowband Internet of Things (NB-IoT), Bluetooth, ZigBee, LoRa, and 5G New Radio (5G NR). A robust authentication mechanism based on machine learning and deep learning techniques is required to protect H-IoT devices from increasing cybersecurity vulnerabilities. Hence, in this review paper, security and privacy challenges and risk mitigation strategies for building resilience in H-IoT are explored and reported.
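
    As one illustration of the ML-based anomaly detection the review discusses, a minimal sketch using scikit-learn's IsolationForest on synthetic per-device traffic features (packet rate, payload size, inter-arrival time); the feature set and model choice are assumptions, not the paper's method.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic baseline traffic: [packets/s, mean payload bytes, inter-arrival s].
normal_traffic = rng.normal([10.0, 200.0, 0.5], [1.0, 20.0, 0.05], size=(500, 3))
detector = IsolationForest(contamination=0.01, random_state=42).fit(normal_traffic)

# A burst of oversized packets, e.g. from a compromised sensor node.
suspect = np.array([[60.0, 1400.0, 0.01]])
print(detector.predict(suspect))  # -1 flags an anomaly
```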

    Feasibility Study of V2X Communications in Initial 5G NR Deployments

    No full text
    Advancements in intelligent vehicles and Intelligent Transport Systems (ITS) have shown that they are now both technologically and commercially feasible. However, there are still significant challenges to overcome, particularly regarding the perception and coordination of intelligent vehicles in unfavourable conditions. Vehicle-to-Everything (V2X) communication is a technology that aims to enable intelligent vehicles to communicate with other road users and infrastructure to increase their range of perception and coordination capabilities. While the 4th generation of cellular technology (4G LTE) is capable of supporting V2X communications to some extent, its multimedia- and telephony-centric design does not translate well to safety-critical applications. As a result, the 5th generation of cellular technology (5G NR) is being developed to improve V2X communications. To investigate the effectiveness of 5G NR for V2X communications, a driving-based measurement campaign was conducted on a commercial cellular network with early 5G NR deployments. Results showed that the existing 4G LTE network is limited in its capability and that early 5G NR deployments can in fact outperform it. However, neither 4G LTE nor 5G NR can reliably support advanced V2X applications, and early 5G NR deployments suffer from significant reliability issues compared to existing 4G LTE deployments. These reliability issues are of particular concern, as they affect the vehicle's ability to trust the information it receives. These findings highlight the need for further design and implementation work on intelligent vehicles and future 5G NR networks to address these reliability concerns and ensure the safe and efficient operation of intelligent vehicles in all conditions.
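
    A sketch of the kind of post-processing such a drive-test campaign implies: per-technology latency percentiles and packet delivery ratio. The distributions, loss rate, and thresholds here are illustrative assumptions, not the paper's measurements.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated one-way latencies in ms; NaN marks a lost packet.
lte = rng.lognormal(mean=4.0, sigma=0.3, size=1000)  # ~55 ms median (assumed)
nr = rng.lognormal(mean=3.2, sigma=0.6, size=1000)   # lower median, more variable
nr[rng.random(1000) < 0.05] = np.nan                 # assumed 5% packet loss

for name, lat in (("4G LTE", lte), ("5G NR", nr)):
    delivered = lat[~np.isnan(lat)]
    pdr = delivered.size / lat.size                  # packet delivery ratio
    p50, p99 = np.percentile(delivered, [50, 99])
    print(f"{name}: PDR={pdr:.1%}, median={p50:.1f} ms, p99={p99:.1f} ms")
```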

    Quantifying the effects of ground truth annotation quality on object detection and instance segmentation performance

    No full text
    Fully-supervised object detection and instance segmentation models have accomplished notable results on large-scale computer vision benchmark datasets. However, the performance of fully-supervised machine learning algorithms depends immensely on the quality of the training data. Preparing computer vision datasets for object detection and instance segmentation is a labor-intensive task requiring each instance in an image to be annotated. In practice, this often results in suboptimal bounding box and polygon mask annotations. This paper empirically quantifies the relationship between ground truth annotation quality and COCO mean average precision (mAP) performance by introducing two separate noise measures, uniform and radial, into the ground truth bounding box and polygon mask annotations for the COCO and Cityscapes datasets. Mask R-CNN models are trained at various levels of each noise measure to investigate the resulting performance. The results showed degradation of mAP as the level of both noise measures increased. For object detection and instance segmentation respectively, the highest level of noise resulted in mAP degradations of 0.185 & 0.208 for uniform noise and 0.118 & 0.064 for radial noise on the COCO dataset. For the Cityscapes dataset, mAP reductions of 0.147 & 0.142 for uniform noise and 0.101 & 0.033 for radial noise were recorded. Furthermore, a decrease in average precision is seen across all classes, with the exception of the motorcycle class. The reductions vary between classes, indicating that the effects of annotation uncertainty are class-dependent.
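
    A minimal sketch of the uniform noise injection the study describes for bounding boxes: each coordinate is perturbed by up to a fraction of the box size. The exact parameterization is a plausible reading of the abstract, not the paper's code.

```python
import numpy as np

def add_uniform_bbox_noise(boxes, level, rng):
    """Perturb boxes in [x, y, w, h] format.

    level: maximum shift as a fraction of box width/height (e.g. 0.1 = 10%).
    """
    boxes = np.asarray(boxes, dtype=float)
    # Scale each of the four coordinates by the box's own width/height.
    scale = np.concatenate([boxes[:, 2:4], boxes[:, 2:4]], axis=1)  # w,h,w,h
    shift = rng.uniform(-level, level, size=boxes.shape) * scale
    noisy = boxes + shift
    noisy[:, 2:4] = np.maximum(noisy[:, 2:4], 1.0)  # keep boxes non-degenerate
    return noisy

rng = np.random.default_rng(0)
gt = np.array([[100.0, 50.0, 40.0, 80.0]])
print(add_uniform_bbox_noise(gt, level=0.1, rng=rng))
```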

    UnShadowNet: Illumination critic guided contrastive learning for shadow removal

    No full text
    Shadows are frequently encountered natural phenomena that significantly hinder the performance of computer vision perception systems in practical settings, e.g., autonomous driving. One solution would be to eliminate shadow regions from images before they are processed by the perception system. Yet, training such a solution requires pairs of aligned shadowed and non-shadowed images, which are difficult to obtain. We introduce UnShadowNet, a novel weakly supervised shadow removal framework trained using contrastive learning. It is composed of a DeShadower network, responsible for removing the extracted shadow under the guidance of an Illumination network that is trained adversarially by an illumination critic, and a Refinement network that further removes artefacts. We show that UnShadowNet can easily be extended to a fully-supervised setup to exploit ground truth when available. UnShadowNet outperforms existing state-of-the-art approaches on three publicly available shadow datasets (ISTD, adjusted ISTD, SRD) in both the weakly and fully supervised setups.
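
    A schematic of the adversarial coupling the abstract describes: a DeShadower proposes a shadow-free image and an illumination critic scores its illumination realism. The tiny architectures and data here are toy stand-ins, not the paper's networks.

```python
import torch
import torch.nn as nn

# Toy stand-ins: a shallow DeShadower and a small convolutional critic.
deshadower = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                           nn.Conv2d(16, 3, 3, padding=1))
critic = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                       nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 1))

bce = nn.BCEWithLogitsLoss()
shadowed = torch.randn(4, 3, 64, 64)     # dummy shadowed inputs
shadow_free = torch.randn(4, 3, 64, 64)  # dummy well-lit references (unpaired)

# Critic step: real illumination vs. the DeShadower's output.
fake = deshadower(shadowed).detach()
critic_loss = (bce(critic(shadow_free), torch.ones(4, 1)) +
               bce(critic(fake), torch.zeros(4, 1)))

# DeShadower step: fool the critic into rating its output as well-lit.
gen_loss = bce(critic(deshadower(shadowed)), torch.ones(4, 1))
print(f"critic: {critic_loss.item():.3f}, generator: {gen_loss.item():.3f}")
```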

    Deep multi-task networks for occluded pedestrian pose estimation

    No full text
    Most existing works on pedestrian pose estimation do not consider estimating the pose of an occluded pedestrian, as annotations of the occluded parts are not available in relevant automotive datasets. For example, CityPersons, a well-known dataset for pedestrian detection in automotive scenes, does not provide pose annotations, whereas MS-COCO, a non-automotive dataset, contains human pose annotations. In this work, we propose a multi-task framework to extract pedestrian features through detection and instance segmentation tasks performed separately on these two distributions. Thereafter, an encoder learns pose-specific features using an unsupervised instance-level domain adaptation method applied to the pedestrian instances from both distributions. The proposed framework improves on state-of-the-art performance in pose estimation, pedestrian detection, and instance segmentation.
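
    The abstract mentions unsupervised instance-level domain adaptation; a gradient reversal layer (GRL) is one common way to align instance features across two datasets, so a minimal sketch is shown below. Its use here is an assumption for illustration, not necessarily the paper's exact mechanism.

```python
import torch
from torch import nn
from torch.autograd import Function

class GradReverse(Function):
    """Identity on the forward pass; negates gradients on the backward pass."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None

encoder = nn.Linear(128, 64)    # toy per-instance feature encoder
domain_head = nn.Linear(64, 2)  # e.g. CityPersons vs. MS-COCO

features = encoder(torch.randn(16, 128))  # pooled per-pedestrian features
domains = torch.randint(0, 2, (16,))      # which dataset each instance is from

logits = domain_head(GradReverse.apply(features, 1.0))
loss = nn.CrossEntropyLoss()(logits, domains)
loss.backward()  # encoder receives reversed gradients, encouraging invariance
```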

    Detecting the overfilled status of domestic and commercial bins using computer vision

    No full text
    As the amount of waste produced globally increases, more efficient waste management solutions are needed to accommodate this expansion. The first step in waste management is the collection of bins or containers. Each bin truck in a fleet is assigned a collection route. As bin trucks have a finite amount of storage for waste, accepting overfilled bins may fill this storage before the end of the collection route. This creates inefficiencies, as a second bin truck is needed to finish the route if the original becomes full. Currently, recording and tracking overfilled bins is a manual process undertaken by the bin truck operator, resulting in longer collection route durations. To create a more efficient and automated process, computer vision methods are considered for the task of detecting bin status. Video footage from a commercial collection route for two bin types, automated side loader (ASL) and front-end loader (FEL), was used to create computer vision datasets for fully supervised object detection and instance segmentation. Selected state-of-the-art object detection and instance segmentation algorithms were used to investigate their performance on this proprietary dataset. A mean average precision (mAP) score of 0.8 or greater was achieved with each model, reflecting the effectiveness of computer vision as a tool for automating the recording of overfilled bins.
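
    The reported mAP scores build on box IoU; a minimal IoU routine like the one below underlies COCO-style detection evaluation. The boxes here are illustrative, not from the bin dataset.

```python
import numpy as np

def box_iou(a, b):
    """IoU of two boxes in [x1, y1, x2, y2] format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

pred = [12, 10, 118, 200]  # hypothetical predicted overfilled-bin box
gt = [10, 12, 120, 198]    # hypothetical ground truth box
print(f"IoU = {box_iou(pred, gt):.3f}")  # >= 0.5 counts as a match at that threshold
```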